Back Transliteration from Japanese to English using Target English Context
Abstract
This paper proposes a method for automatic back transliteration of proper nouns, in which a transliterated Japanese word is restored to its original English word. The method builds each English word from a sequence of letters, so it can produce English words that are not registered in dictionaries or English word lists. When a katakana character is converted into English letters, there are many candidate letter strings. To choose among them, the proposed method uses the target English context to calculate the probability that an English character or string corresponds to a Japanese katakana character or string. An experiment on back transliteration of personal names confirmed the effectiveness of using the target English context.
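As a rough illustration of the idea in the abstract, the sketch below scores each candidate English string for a katakana character conditioned on the English letter already generated before it, then picks the highest-scoring combination. It is a toy under stated assumptions, not the paper's model: the tables `CANDIDATES` and `CONTEXT_PROB`, the `FALLBACK` value, the one-letter context window, and the function names are all hypothetical and the probabilities are invented for the example.

```python
from itertools import product

# Hypothetical candidate English strings for each katakana character.
CANDIDATES = {"ス": ["s", "th", "su"], "ミ": ["mi", "my"]}

# Hypothetical P(english string | katakana, previous English letter).
# "^" marks the start of the word. All numbers are invented.
CONTEXT_PROB = {
    ("ス", "^"): {"s": 0.7, "su": 0.2, "th": 0.1},
    ("ス", "i"): {"th": 0.6, "s": 0.3, "su": 0.1},
    ("ミ", "s"): {"mi": 0.8, "my": 0.2},
}
FALLBACK = 0.01  # assumed probability for unseen (katakana, context) pairs


def score(katakana, pieces):
    """Product of context-conditioned probabilities for one candidate."""
    prob = 1.0
    prev = "^"
    for k, e in zip(katakana, pieces):
        prob *= CONTEXT_PROB.get((k, prev), {}).get(e, FALLBACK)
        prev = e[-1]  # the target-side context is the last English letter
    return prob


def back_transliterate(katakana):
    """Exhaustively score every candidate combination and keep the best."""
    options = [CANDIDATES[k] for k in katakana]
    best = max(product(*options), key=lambda pieces: score(katakana, pieces))
    return "".join(best), score(katakana, best)


if __name__ == "__main__":
    print(back_transliterate("スミス"))
```

With these toy tables, the two occurrences of ス in スミス receive different English strings ("s" at the word start, "th" after "i"), which a table keyed on the katakana character alone could not do. Exhaustive search is used only because the example is tiny; a real system would need a more efficient search.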
Similar resources
Automatic Transliteration and Back-transliteration by Decision Tree Learning
Automatic transliteration and back-transliteration across languages with drastically different alphabets and phoneme inventories, such as English/Korean, English/Japanese, English/Arabic, and English/Chinese, have practical importance in machine translation, cross-lingual information retrieval, and automatic bilingual dictionary compilation. In this paper, a bi-directional and to some exte...
Extracting Transliteration Pairs from Comparable Corpora
Transliterating words and names from one language to another is a frequent and highly productive phenomenon. For example, the English word cache is transliterated in Japanese as キャッシュ “kyasshu”. In many cases, recent transliterations are not recorded in machine-readable dictionaries, so it is impossible to rely on dictionary lookup to find transliteration equivalents. In this paper we describe a meth...
Transliteration Considering Context Information based on the Maximum Entropy Method
This paper proposes a method of automatic transliteration from English to Japanese words. Our method successfully transliterates an English word not registered in any bilingual or pronunciation dictionaries by converting each partial letter sequence in the English word into Japanese katakana characters. In such transliteration, identical letters occurring in different English words must often be conver...
A Statistical Approach to Chinese-to-English Back-Transliteration
This paper describes a statistical approach for modeling Chinese-to-English back-transliteration. Unlike previous approaches, the model does not involve the use of either a pronunciation dictionary for converting source words into phonetic symbols or manually assigned phonetic similarity scores between source and target words. The parameters of the proposed model are automatically learned from ...
Syllable-based Machine Transliteration with Extra Phrase Features
This paper describes our syllable-based phrase transliteration system for the NEWS 2012 shared task on English-Chinese track and its back. Grapheme-based transliteration maps the character(s) on the source side directly to the target character(s). However, character-based segmentation on the English side causes ambiguity in the alignment step. In this paper we utilize a phrase-based model to solve ma...